LIBRY7.DOC | Text File | 1989-11-10 | 7KB | 131 lines
.pa
VECTOR EMULATIONS
Vector emulations are software procedures that mimic the operation of vector
processing hardware. Of course, the software is not based on the same
principle as the hardware; but the concept is the same: specific procedures
designed to perform similar repetitive tasks on contiguously stored real
numbers as efficiently as possible. No, I won't tell you how I do it, so don't
ask. My vector emulations are completely compatible with Hewlett-Packard's
Vector Instruction Set (VIS). They have the same calling syntax and function
(that's why I developed them in the first place: to run programs downloaded
from an HP-1000F). HP has a very nice manual with examples. If you are interested,
perhaps they would sell you one (I wouldn't even hazard a guess as to the
cost).
Vector Instruction Set (VIS) User's Manual
Part No. 12824-90001
Hewlett-Packard Company
Data Systems Division
11000 Wolfe Road
Cupertino, CA 95014
If you are using a PC you don't need a math coprocessor (Intel 8087/80287) in
order to run a program linked with LIBRY; but it makes a TREMENDOUS
difference (a factor of 120 or so for floating point operations). The vector
emulations will run even without a math coprocessor; but in that case the
speed is already so slow that nothing will help. The improvement in speed
with the vector emulations varies depending on the relative speed of your
processor and coprocessor (not MHz speed but MIPS and FLOPS - a 5MHz-80286 is
quite a bit faster than a 5MHz-8088 while a 5MHz-8087 is just as fast as a
5MHz-80287). The greatest improvement is realized on a PC with a
5MHz-8088/5MHz-8087 pair; and the least improvement is realized on an AT with
an 8MHz-80286/5MHz-80287 pair.
Note that the increments (INCR1,INCR2,INCR3), the index (M), and the count (N)
are of the type INTEGER*2. Reals are of the type REAL*4 and double precision
reals are of the type REAL*8. There can be no mixing of REAL*4 and REAL*8
types in the same emulation. To get double precision use "CALL DVABS(...)"
rather than "CALL VABS(...)".
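The type rules above can be illustrated with plain FORTRAN stand-ins for the
single and double precision absolute-value emulations. This is only a sketch
of the calling convention. The names XVABS and XDVABS are hypothetical (chosen
so that they do not clash with the library's VABS and DVABS), and the loops
say nothing about how LIBRY does the work internally.

```fortran
C     HYPOTHETICAL STAND-INS FOR VABS AND DVABS (NOT THE LIBRY CODE).
C     INCREMENTS AND COUNT ARE INTEGER*2 IN BOTH VERSIONS.
      SUBROUTINE XVABS(V1,INCR1,V2,INCR2,N)
      INTEGER*2 INCR1,INCR2,N,I,I1,I2
C     SINGLE PRECISION DATA: REAL*4 ONLY
      REAL*4 V1(*),V2(*)
      IF(N.LT.1) GO TO 999
      I1=1
      I2=1
      DO 100 I=1,N
      V2(I2)=ABS(V1(I1))
      I1=I1+INCR1
      I2=I2+INCR2
  100 CONTINUE
  999 RETURN
      END
C
      SUBROUTINE XDVABS(V1,INCR1,V2,INCR2,N)
      INTEGER*2 INCR1,INCR2,N,I,I1,I2
C     DOUBLE PRECISION DATA: REAL*8 ONLY, NEVER MIXED WITH REAL*4
      REAL*8 V1(*),V2(*)
      IF(N.LT.1) GO TO 999
      I1=1
      I2=1
      DO 100 I=1,N
      V2(I2)=DABS(V1(I1))
      I1=I1+INCR1
      I2=I2+INCR2
  100 CONTINUE
  999 RETURN
      END
```

A caller should hold the increments and the count in INTEGER*2 variables
rather than passing bare integer constants, whose type depends on the
compiler's default integer size.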
It is very important to BE SURE THAT NO VECTOR CROSSES A SEGMENT BOUNDARY
(refer to Microsoft FORTRAN manual section 8). What this means to the machine
is that a vector must reside within a single segment (65536 bytes); otherwise
the emulation cannot address all of the elements as a group. To ensure this,
NEVER use the $LARGE metacommand. If you have no COMMON then you never have to
worry about this. If you do have COMMON make sure that each COMMON contains no
more than 65536 bytes. Of course, you can have several named COMMONs so this
is not too restrictive a limit on your programs. Also, if more than one vector
is passed to the emulator they need not reside in the same segment. For
instance, you can add one REAL*4 vector with 16384 elements (16384 x 4 = 65536
bytes) to another with 16384 elements and store the result in a third - as
long as they are all in different COMMONs. Of course, you can add two vectors
in the same COMMON provided their total number of elements does not exceed
16384 (8192 for REAL*8). There is a way of getting around this; but it is too
involved to explain here.
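As a sketch of the layout described above, the three 16384-element vectors
can each be given a named COMMON of their own, so that each one fills exactly
one segment. The COMMON names (VECA, VECB, VECC) and the subroutine name ADD3
are hypothetical, and a plain DO loop stands in for the library call so that
the fragment compiles without LIBRY.

```fortran
C     SKETCH ONLY: ONE 16384-ELEMENT REAL*4 VECTOR PER NAMED COMMON.
C     EACH COMMON HOLDS 16384*4 = 65536 BYTES, I.E. ONE SEGMENT.
C     THE COMMON NAMES AND THE NAME ADD3 ARE HYPOTHETICAL.
      SUBROUTINE ADD3
      INTEGER*2 INC,N,I
      REAL*4 A,B,C
      COMMON /VECA/ A(16384)
      COMMON /VECB/ B(16384)
      COMMON /VECC/ C(16384)
      INC=1
      N=16384
C     WITH LIBRY LINKED, THE LOOP BELOW WOULD BE REPLACED BY
C     CALL VADD(A,INC,B,INC,C,INC,N)
      DO 100 I=1,N
      C(I)=A(I)+B(I)
  100 CONTINUE
      RETURN
      END
```

Putting all three vectors into a single COMMON would give 3 x 65536 bytes,
which is the case the text warns against.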
A word of warning... vector emulations do not like being interrupted; that is
the nature of "speed at any cost" procedures. For this reason, the emulations
may interfere with the operation of some "pop-up" programs and such things as
windowing and multitasking. Regrettably, this is unpredictable. I
can say that the emulations don't interfere with any of the "pop-up" programs
that I have developed (like my DOS command stack full-screen editor and
improved scroller) that "lurk" in the background; but I don't know about such
programs that others have developed.
.pa
SAMPLE FORTRAN EQUIVALENT OF A VECTOR ADD
      SUBROUTINE VADD(V1,INCR1,V2,INCR2,V3,INCR3,N)
C
C     VECTOR V3=V1+V2
C
      IMPLICIT INTEGER*2(I-N),REAL*4(A-H,O-Z)
      DIMENSION V1(N),V2(N),V3(N)
C
      IF(N.LT.1) GO TO 999
      I1=1
      I2=1
      I3=1
C
      DO 100 I=1,N
      V3(I3)=V1(I1)+V2(I2)
      I1=I1+INCR1
      I2=I2+INCR2
  100 I3=I3+INCR3
C
  999 RETURN
      END
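The increments in the sample above are element strides, which is what lets
one routine walk a row of a matrix as easily as a column. The following
sketch shows this with a plain FORTRAN equivalent of the dot product. XVDOT
and ROWDOT are hypothetical names, the loop is not the LIBRY code, and only
positive increments are exercised here.

```fortran
C     HYPOTHETICAL PLAIN-FORTRAN EQUIVALENT OF VDOT (NOT THE LIBRY CODE)
      SUBROUTINE XVDOT(S,V1,INCR1,V2,INCR2,N)
      INTEGER*2 INCR1,INCR2,N,I,I1,I2
      REAL*4 S,V1(*),V2(*)
      S=0.0
      IF(N.LT.1) GO TO 999
      I1=1
      I2=1
      DO 100 I=1,N
      S=S+V1(I1)*V2(I2)
      I1=I1+INCR1
      I2=I2+INCR2
  100 CONTINUE
  999 RETURN
      END
C
C     DOT PRODUCT OF ROW IROW OF A WITH X.  A IS STORED BY COLUMNS,
C     SO SUCCESSIVE ELEMENTS OF A ROW ARE LDA ELEMENTS APART.
      SUBROUTINE ROWDOT(S,A,LDA,IROW,X,N)
      INTEGER*2 LDA,IROW,N,INC1
      REAL*4 S,A(LDA,*),X(*)
      INC1=1
      CALL XVDOT(S,A(IROW,1),LDA,X,INC1,N)
      RETURN
      END
```

With the library linked, the call inside ROWDOT would presumably become CALL
VDOT with the same argument list, since the summary table gives VDOT that
calling syntax.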
.pa
SUMMARY OF VECTOR INSTRUCTION SET
------------------------------------------------------------------------------
calling syntax                             operation
------------------------------------------------------------------------------
CALL VABS(V1,INCR1,V2,INCR2,N)             (V2(I)=ABS(V1(I)),I=1,N)
CALL VADD(V1,INCR1,V2,INCR2,V3,INCR3,N)    (V3(I)=V1(I)+V2(I),I=1,N)
CALL VDIV(V1,INCR1,V2,INCR2,V3,INCR3,N)    (V3(I)=V1(I)/V2(I),I=1,N)
CALL VDOT(S,V1,INCR1,V2,INCR2,N)           S=SUM(V1(I)*V2(I),I=1,N)
CALL VMAB(M,V1,INCR1,N)                    V1(M)=AMAX1(ABS(V1(I)),I=1,N)
CALL VMAX(M,V1,INCR1,N)                    V1(M)=AMAX1(V1(I),I=1,N)
CALL VMIB(M,V1,INCR1,N)                    V1(M)=AMIN1(ABS(V1(I)),I=1,N)
CALL VMIN(M,V1,INCR1,N)                    V1(M)=AMIN1(V1(I),I=1,N)
CALL VMOV(V1,INCR1,V2,INCR2,N)             (V2(I)=V1(I),I=1,N)
CALL VMPY(V1,INCR1,V2,INCR2,V3,INCR3,N)    (V3(I)=V1(I)*V2(I),I=1,N)
CALL VNRM(S,V1,INCR1,N)                    S=SUM(ABS(V1(I)),I=1,N)
CALL VPIV(S,V1,INCR1,V2,INCR2,V3,INCR3,N)  (V3(I)=S*V1(I)+V2(I),I=1,N)
CALL VSAD(S,V1,INCR1,V2,INCR2,N)           (V2(I)=S+V1(I),I=1,N)
CALL VSDV(S,V1,INCR1,V2,INCR2,N)           (V2(I)=S/V1(I),I=1,N)
CALL VSMY(S,V1,INCR1,V2,INCR2,N)           (V2(I)=S*V1(I),I=1,N)
CALL VSSB(S,V1,INCR1,V2,INCR2,N)           (V2(I)=S-V1(I),I=1,N)
CALL VSUB(V1,INCR1,V2,INCR2,V3,INCR3,N)    (V3(I)=V1(I)-V2(I),I=1,N)
CALL VSUM(S,V1,INCR1,N)                    S=SUM(V1(I),I=1,N)
CALL VSWP(V1,INCR1,V2,INCR2,N)             (V1(I)<->V2(I),I=1,N)
CALL VMIX(INDEX,V1,V2,N)                   (V2(I)=V1(INDEX(I)),I=1,N)
CALL VMXI(INDEX,V1,V2,N)                   (V2(INDEX(I))=V1(I),I=1,N)
CALL CLAMP(VMIN,VMAX,V,N)                  (V(I)=AMAX1(VMIN,AMIN1(VMAX,
                                           V(I))),I=1,N)
H=HORNER(C,X,N)                            H=SUM(C(I)*X**(I-1),I=1,N)
------------------------------------------------------------------------------
Note 1: There is little or no improvement for N<5 and runtimes may increase
for N<3.
Note 2: For double precision add a "D" prefix (e.g. DVABS, DVADD, ...,
DCLAMP, DHORNE).
Note 3: Vectors must not cross a segment boundary (see section 8 of Microsoft
FORTRAN user's guide).
Note 4: All integers (e.g. INCR1,INCR2,INCR3,N,...) are of the INTEGER*2
type.
Note 5: Increments (viz. INCR1,INCR2,INCR3) can be positive, negative, or
zero.
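Two of the less familiar entries in the summary can be spelled out in plain
FORTRAN. The sketch below evaluates the HORNER sum with Horner's rule and
performs the VMIX gather. The names XHORN, XVMIX, and INDX are hypothetical,
INDX is assumed to be an INTEGER*2 array in keeping with Note 4, and none of
this is the LIBRY code.

```fortran
C     HYPOTHETICAL PLAIN-FORTRAN EQUIVALENTS (NOT THE LIBRY CODE).
C
C     XHORN = C(1) + C(2)*X + ... + C(N)*X**(N-1), EVALUATED BY
C     HORNER'S RULE: START AT C(N), MULTIPLY BY X, ADD NEXT C.
      REAL*4 FUNCTION XHORN(C,X,N)
      INTEGER*2 N,I
      REAL*4 C(*),X
      XHORN=0.0
      IF(N.LT.1) GO TO 999
      XHORN=C(N)
      DO 100 I=N-1,1,-1
      XHORN=XHORN*X+C(I)
  100 CONTINUE
  999 RETURN
      END
C
C     GATHER: V2(I) = V1(INDX(I)), I=1,N
      SUBROUTINE XVMIX(INDX,V1,V2,N)
      INTEGER*2 INDX(*),N,I
      REAL*4 V1(*),V2(*)
      IF(N.LT.1) GO TO 999
      DO 100 I=1,N
      V2(I)=V1(INDX(I))
  100 CONTINUE
  999 RETURN
      END
```

Horner's rule needs N-1 multiplications and N-1 additions and never forms a
power of X explicitly, which is why it is the usual way to evaluate the sum
shown in the table.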
.ad LIBRY7A.DOC